-
- CURRENT_MEETING_REPORT_
-
-
- Reported by Claudio Topolcic/CNRI and Bernhard Stockman/NORDUnet
-
- Minutes of the Operational Statistics Working Group (OPSTAT)
-
- Monday's Session
-
- The purposes of this meeting were:
-
-
- 1. Review the current status of the OPSTAT activities
-
- o Bernhard's papers
- o Other related efforts, specifically, Susan Estrada's BOF
-
- 2. Decide what can be progressed now and progress it
-
- o Model
- o Set of metrics (simple SNMP only)
- o Display formats
- o Simple collection, storage, and exchange
-
- 3. Define what is still left to do
-
- o MIB for new SNMP variables
- o Exchange protocol
- o More sophisticated storage formats
- o Develop publicly available collection tools
- o Display formats for weekly and instantaneous reports
-
- 4. Specific actions to be taken in this meeting were:
-
- o Decide polling period
- o Agree on what to progress
- o Edit Bernhard's papers, review on Thursday, submit as Internet
- Draft
-
-
- The model was presented for people who were new to the group. A
- fundamental part of this model is the agreement on a common minimal set
- of metrics that will be collected. It was noted that some of these may
- be difficult to obtain.
-
- It had been proposed that three report formats be produced: a monthly
- report, a weekly report, and an instantaneous display. A format for the
- monthly report had been agreed to. It was described as a ``McDonald's''
- report because it would contain only total aggregates. It was felt that
- this report would support management
- activities, whereas the weekly report would support engineering
- planning, and the instantaneous display would support problem
- resolution. However, it was realized that the real distinction was not
- the time frame but the degree of aggregation of the data. The data in
- the management reports would be more aggregated than that in the
- engineering reports, regardless of the time they covered.
-
- Bernhard's documents described the data that would be collected from
- each router, both for each of the router's interfaces and for the router
- itself. These are all MIB variables. It was at first assumed that the
- per interface variables were specific to IP, but it was pointed out that
- the loading data needs to be total, not IP specific, or the link loading
- could not be determined. It was also pointed out that the MIB interface
- variables are multi-protocol anyway, so there is no problem. However,
- it was also pointed out that if the router variables are IP only, then
- they do not give a measure of the router's loading.
-
- It was noted that the loading information that is important is not
- related to any interface, but to the links. Links are occasionally
- rehomed when interfaces fail. Currently, the data is processed by hand
- to compensate for such rehoming. The documents do not make this
- distinction and need to be clarified.
-
- Dropping the ``storage requirements'' section of Bernhard's document was
- considered, but it was decided to keep it in, since dropping it would
- give the misimpression that the group hadn't thought about the problem.
-
- It had been proposed that the client-server model not be covered in the
- current documents. The reason, in part, was that the original purpose
- of the Working Group was to get the various network operators to produce
- consistent reports that could be compared, not to exchange information,
- and that exchanging information is not required very often.
-
- The data storage format was discussed. The format impacts what will be
- stored and what can be done with it. To reduce storage requirements,
- several people proposed that raw data could be kept for some period of
- time, and then aggregated somewhat and kept for some other period of
- time, and then further aggregated. The proposals differed in the time
- periods and the form of aggregation. However, it was pointed out that
- engineering requirements tend to be common across operators, so common
- non-aggregated data will be useful, whereas management requirements tend
- to differ, so common aggregated data is not. In the end, it was
- realized that how much data is retained, and how long, are local
- decisions that cannot be standardized.
-
- The data format should support the process that the data will undergo.
- The process was identified as:
-
-
- 1. Collect status data about routers and interfaces.
-
- 2. Collect ``resource'' data, for example, about the mapping of links
- to interfaces.
-
- 3. Process the data to merge 1 and 2, decreasing the quantity of data
- but without loss of information.
-
- 4. Produce reports from the above reduced data.
-
-
- It was understood that the processing in step 3 would not lead to
- sufficient reduction in quantity to address long term data storage
- problems. However, it was felt that this processing should not be
- combined with the report generation.
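-
- The merge in step 3 can be sketched in Python as follows. This is only
- an illustration under stated assumptions: the record layouts, the names
- samples, link_map, link_for, and merge_by_link, and the sample values
- are all hypothetical and do not come from Bernhard's documents.

```python
from collections import defaultdict

# Step 1: status data per (router, interface). Hypothetical layout:
# (timestamp_seconds, router, interface, octets_in_sample_interval)
samples = [
    (0, "r1", "if0", 1000),
    (60, "r1", "if0", 1200),
    (120, "r1", "if1", 900),  # the link has been rehomed to if1
]

# Step 2: "resource" data mapping links to interfaces. Hypothetical layout:
# (link, router, interface, valid_from, valid_until or None if still valid)
link_map = [
    ("link-A", "r1", "if0", 0, 120),
    ("link-A", "r1", "if1", 120, None),
]

def link_for(router, interface, timestamp):
    """Return the link carried by this interface at this time, or None."""
    for link, r, i, start, end in link_map:
        if r == router and i == interface and start <= timestamp:
            if end is None or timestamp < end:
                return link
    return None

def merge_by_link(samples):
    """Step 3: re-key interface samples by the link they describe."""
    per_link = defaultdict(list)
    for timestamp, router, interface, octets in samples:
        link = link_for(router, interface, timestamp)
        if link is not None:
            per_link[link].append((timestamp, octets))
    return dict(per_link)

merged = merge_by_link(samples)
print(merged)
```

- Keying the merged data by link rather than by interface means that a
- rehomed link keeps one continuous history, which is the compensation
- that is currently done by hand.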
-
- Bernhard proposed a raw data format, which was discussed. He will
- incorporate suggestions into his document.
-
- It was suggested that the monthly reports be based on a matrix that
- identified all the variables that would be collected and processing
- functions that could be applied to them. This would not only clearly
- delimit the scope of the report generation process, but would also allow
- new variables to be added easily. However, this approach would not
- support functions that are based on multiple variables, and although the
- matrix could be relatively full, any network operator might select only
- a few possibilities, and worse, the different operators might select
- different sets.
-
- It was felt that the Working Group should recommend a specific polling
- period. Two were on the table: 5 minutes and 15 minutes. Concern was
- expressed that 5 minutes or less might result in excessive overhead or
- be impossible to implement with a poller that polls one router at a
- time. For variables describing link loading, such as bytes transmitted,
- the polling period is a function of the line speed. A one minute
- polling period will miss the interesting peaks of a T1 line, but will
- show the individual packets on a 1200 baud line. For variables not
- describing link loading, such as packets dropped, the polling interval
- can generally be very long, until the value changes, at which time the
- polling period should be shortened to help identify the problem. So it
- may be that a 15 minute polling period is sufficient for anything other
- than link utilization. This discussion was deferred until the next
- meeting on Thursday.
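-
- The dependence of the polling period on line speed can be illustrated
- with a little arithmetic. The sketch below assumes 576-byte packets and
- treats a 1200 baud line as 1200 bit/s and a T1 line as 1.544 Mbit/s. The
- function name and the packet size are assumptions made for this example.

```python
def max_packets_per_poll(line_bps, poll_seconds, packet_bytes=576):
    """Upper bound on the packets a fully loaded line carries per poll."""
    return line_bps * poll_seconds / (packet_bytes * 8)

# Compare a slow serial line with a T1 at a one minute polling period.
for name, bps in (("1200 baud", 1200), ("T1", 1_544_000)):
    print(name, max_packets_per_poll(bps, 60))
```

- Under these assumptions a one minute sample covers at most about 16
- packets on the slow line but about 20,000 on the T1, so short bursts on
- the faster line are averaged away.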
-
- Geoff Huston suggested a different approach. He proposed that the link
- utilization parameter that is most closely correlated to the clients'
- dissatisfaction is the mean standard deviation of inter-packet arrival
- times of evenly spaced (when transmitted) TCP packets. He suggested
- that this parameter explodes as soon as congestion appears.
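-
- One possible reading of this parameter is sketched below: the standard
- deviation of the gaps between arrivals is computed for each train of
- evenly spaced packets and then averaged over several trains. The
- function names and the arrival times are invented for illustration, and
- the minutes do not say how the parameter would actually be measured.

```python
from statistics import mean, pstdev

def interarrival_stdev(arrival_times):
    """Standard deviation of the gaps between consecutive arrivals."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    return pstdev(gaps)

def mean_interarrival_stdev(trains):
    """Average the per-train standard deviation over several trains."""
    return mean([interarrival_stdev(train) for train in trains])

# Packets sent one second apart; arrival times in seconds (made up).
steady = [0.0, 1.0, 2.0, 3.0, 4.0]      # spacing preserved
congested = [0.0, 1.0, 3.5, 3.6, 6.0]   # queueing distorts the spacing

print(interarrival_stdev(steady))
print(interarrival_stdev(congested))
print(mean_interarrival_stdev([steady, congested]))
```

- With an uncongested path the spacing is preserved and the value stays
- near zero. Queueing delay perturbs the spacing and the value rises.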
-
-
-
- Thursday's Session
-
- During the second OPSTAT session the storage format and the polling
- periods were discussed in more detail.
-
- The Storage Format
-
- It was suggested that the header section be placed within the log-file.
- However, it may be useful to support both separate and in-band headers.
- A need was expressed for multiple header sections within one log-file.
- When the same log-file is closed and reopened, close and start times
- need to be specified. When the log-source changes, a new device needs to
- be specified. Three delimiter pairs were suggested:
-
-
-
- BEGIN_TIME - END_TIME
- BEGIN_DEVICE - END_DEVICE
- BEGIN_DATA - END_DATA
-
-
-
- There are currently two storage formats: the version presented by
- Bernhard Stockman and an earlier version produced by Chris Myers.
- Chris Myers volunteered to produce a second version of his storage
- format strawman.
-
- The generic log data format is:
-
-
-
- timestamp, tag, delta_sample_interval, data1, data2, data3, ..., dataN
-
-
-
- where the tag defines the logged variables.
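-
- A record in this generic format could be read with a few lines of
- Python. The sketch assumes integer values, a timestamp in seconds, and a
- tag named "ifload". None of these details are fixed by the format above,
- and parse_record is a hypothetical helper.

```python
def parse_record(line):
    """Split one comma-separated log record into its named parts."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) < 4:
        raise ValueError("need timestamp, tag, interval and one data field")
    return {
        "timestamp": int(fields[0]),
        "tag": fields[1],
        "delta_sample_interval": int(fields[2]),
        "data": [int(value) for value in fields[3:]],
    }

record = parse_record("700000000, ifload, 60, 1200, 900, 3")
print(record)
```

- Because the tag defines the logged variables, a reader would use it to
- decide how many data fields to expect and what each one means.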
-
-
- The Polling Period
-
- The purpose of polling is to obtain statistics that serve as a basis
- for trend analysis and capacity planning. It shall be possible to derive
- engineering and management data from the operational data.
-
- A polling period of 15 minutes will not be sufficient to detect
- variations in peak behavior. It was suggested that a period of at most
- 1 minute would be needed. Such a tight polling period will create a need
- for aggregating stored data. Aggregation here means that, over some
- aggregation period, a new aggregated entry is computed from the first
- and last of the previously logged entries in that period.
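-
- Computing a new entry from the first and last logged entries works
- directly when the logged values are cumulative counters, which is the
- assumption made in the sketch below. The entry layout and the function
- name are hypothetical, and counter wrap-around is ignored.

```python
def aggregate_first_last(entries):
    """Collapse time-ordered (timestamp, counter) entries into one entry.

    Returns (end_timestamp, interval_seconds, counter_increase).
    """
    (t_first, c_first), (t_last, c_last) = entries[0], entries[-1]
    if t_last <= t_first:
        raise ValueError("entries must span a positive time interval")
    return (t_last, t_last - t_first, c_last - c_first)

# Four 1-minute samples of a cumulative octet counter (made-up values).
logged = [(0, 100), (60, 400), (120, 1000), (180, 1900)]
end, interval, increase = aggregate_first_last(logged)
print(end, interval, increase, increase / interval)
```

- The intermediate entries do not affect the result, which is why an
- average survives this kind of aggregation but a peak does not.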
-
- A method of displaying both average and peak-behaviors in the same
- bar-diagram is to compute both the average value over some period and
- the peak value during the same period. The average and peak values are
- then displayed in the same bar.
-
- A problem here is how to aggregate peak values. Possibilities include
- taking the peak of all the peaks, the average of all the peaks, etc.
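-
- The two options named above give different answers on the same data, as
- the following sketch shows. The sample values and function names are
- invented for illustration.

```python
def average_and_peak(samples):
    """Return the (average, peak) pair that one bar would display."""
    return sum(samples) / len(samples), max(samples)

def peak_of_peaks(peaks):
    return max(peaks)

def average_of_peaks(peaks):
    return sum(peaks) / len(peaks)

# Three aggregation periods of utilization samples in percent (made up).
periods = [[10, 20, 90], [30, 30, 30], [5, 60, 10]]
bars = [average_and_peak(p) for p in periods]
peaks = [peak for _, peak in bars]

print(bars)
print(peak_of_peaks(peaks), average_of_peaks(peaks))
```

- The peak of peaks preserves the worst case over the longer period, while
- the average of peaks smooths it and indicates how typical high load is.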
-
- Another reason for aggregation is that the polling period needed
- differs depending on the reason for and the source of the polling.
-
- What is foreseen is that, over a relatively short period, polled data
- will be logged at the tightest polling period (1 minute). These data
- will regularly be pre-processed into the actual files being stored. The
- pre-processing may include steps such as computing the percentage of
- samples above a certain limit, the average of all samples during the
- aggregation period, and cumulative histograms. This pre-processing will
- then not only compact the stored data but also provide some initial
- statistical processing.
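-
- The three pre-processing steps mentioned above can be sketched as
- follows. The limit, the histogram bin edges, the sample values, and the
- function name are all assumptions made for the example.

```python
def preprocess(samples, limit, bin_edges):
    """Summarize the samples of one aggregation period."""
    count = len(samples)
    above = sum(1 for s in samples if s > limit)
    return {
        "percent_above_limit": 100.0 * above / count,
        "average": sum(samples) / count,
        # Cumulative histogram: number of samples at or below each edge.
        "cumulative_histogram": [
            sum(1 for s in samples if s <= edge) for edge in bin_edges
        ],
    }

# Ten 1-minute utilization samples in percent (made-up values).
minute_samples = [10, 20, 35, 50, 80, 95, 40, 60, 70, 30]
summary = preprocess(minute_samples, limit=75, bin_edges=[25, 50, 75, 100])
print(summary)
```

- The summary is much smaller than the raw samples, which is the storage
- compaction, and it already answers the first statistical questions.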
-
- Recommendation on polling period:
-
-
-
- Basic polling period 1 minute (60 seconds).
-
-
-
- Recommendation on aggregation periods:
-
-
-
- Over a
-
- 24 hour period aggregate to 15 minutes,
- 1 month period aggregate to 1 hour,
- 1 year period aggregate to 1 day
-
-
-
- Aggregation is the computation of new average and maximum values for the
- aggregation period based on the previous aggregation period data.
-
- Recommendation for saving periods of logged and aggregated data:
-
-
-
- 15 minute aggregation period saved 1 week.
- 1 hour aggregation period saved 1 month.
- 1 day aggregation period saved 1 year.
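-
- The recommended cascade can be sketched as below, with each level
- computed from the level before it. Entries are taken to be (average,
- maximum) pairs, one month is taken as 31 days, and the names TIERS,
- aggregate, and cascade are hypothetical. These are assumptions, not part
- of the recommendation.

```python
# (level name, finer entries combined per new entry, days the level is saved)
TIERS = [
    ("15 minutes", 15, 7),   # from 1-minute samples, saved 1 week
    ("1 hour", 4, 31),       # from 15-minute entries, saved 1 month
    ("1 day", 24, 365),      # from 1-hour entries, saved 1 year
]

def aggregate(entries, factor):
    """Combine each run of `factor` (average, maximum) entries into one."""
    combined = []
    for start in range(0, len(entries) - factor + 1, factor):
        chunk = entries[start:start + factor]
        average = sum(a for a, _ in chunk) / factor
        maximum = max(m for _, m in chunk)
        combined.append((average, maximum))
    return combined

def cascade(minute_entries):
    """Build every aggregation level from 1-minute entries."""
    levels = {}
    entries = minute_entries
    for name, factor, _days_saved in TIERS:
        entries = aggregate(entries, factor)
        levels[name] = entries
    return levels

# One day of made-up 1-minute samples; a single sample is its own maximum.
one_day = [(float(i % 100), float(i % 100)) for i in range(1440)]
levels = cascade(one_day)
print({name: len(entries) for name, entries in levels.items()})
```

- Because every level combines the same number of equally long entries,
- the average of averages equals the true average, and the maximum of
- maxima equals the true maximum.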
-
-
-
- Finally it was decided that, as the current document will not contain
- the protocol specification of the client-server model, it will be
- sufficient to put the coming RFC into the informational track.
-
- Attendees
-
- Vikas Aggarwal aggarwal@jvnc.net
- Miriam Amos Nihart miriam@decwet.zso.dec.com
- Jordan Becker becker@nis.ans.net
- Robert Blokzijl K13@nikhef.nl
- Steve Bostock steveb@novell.com
- Randy Butler rbutler@ncsa.uiuc.edu
- John Gong jgong@us.oracle.com
- Phillip Gross pgross@nis.ans.net
- Greg Hollingsworth gregh@mailer.jhuapl.edu
- Kathleen Huber khuber@bbn.com
- Geoff Huston g.huston@aarnet.edu.au
- Walter Lazear lazear@gateway.mitre.org
- April Marine april@nisc.sri.com
- Robert Morgan morgan@jessica.stanford.edu
- Dennis Morris morrisd@imo-uvax.dca.mil
- Chris Myers chris@wugate.wustl.edu
- Rebecca Nitzan nitzan@es.net
- Marsha Perrott mlp+@andrew.cmu.edu
- Ron Roberts roberts@jessica.stanford.edu
- Timothy Salo tjs@msc.edu
- Bernhard Stockman boss@sunet.se
- Joanie Thompson joanie@nsipo.nasa.gov
- Claudio Topolcic topolcic@nri.reston.va.us
- Andrew Veitch aveitch@bbn.com
- Wengyik Yeong yeongw@psi.com
- Osmund de Souza osmund.desouza@att.com
-